Variational quantum compiling with double Q-learning
Abstract
Quantum compiling aims to construct a quantum circuit V from quantum gates drawn from a native gate alphabet, such that V is functionally equivalent to the target unitary U. It is a crucial stage for running quantum algorithms on noisy intermediate-scale quantum (NISQ) devices. However, the space for exploring quantum circuit structures is enormous, resulting in the requirement of human expertise, hundreds of experimentations, or modifications of existing quantum circuits. In this paper, we propose a variational quantum compiling (VQC) algorithm based on reinforcement learning (RL), in order to automatically design the structure of the quantum circuit for VQC with no human intervention. An agent is trained to sequentially select quantum gates from the native gate alphabet, and the qubits they act on, by double Q-learning with an ε-greedy exploration strategy and experience replay. At first, the agent randomly explores a number of quantum circuits with different structures, and then iteratively discovers structures with higher performance on the learning task. Simulation results show that the proposed method can make exact compilations with fewer quantum gates compared to previous VQC algorithms. It can reduce the errors of quantum algorithms due to the decoherence process and gate noise in NISQ devices, and enable quantum algorithms, especially complex ones, to be executed within the coherence time.
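The training loop described in the abstract (sequential gate selection, double Q-learning, ε-greedy exploration, experience replay) can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the single-qubit alphabet {H, S, X}, the fixed (non-variational) gates, the depth cap `MAX_DEPTH`, the reward of 1 for an exact match, the cost 1 - |Tr(V†U)|²/4, and all hyperparameters are assumptions chosen only to keep the example small and runnable.

```python
import math
import random

S2 = 1 / math.sqrt(2)
# Hypothetical single-qubit gate alphabet (fixed gates, no variational parameters).
GATES = {
    "H": ((S2, S2), (S2, -S2)),
    "S": ((1, 0), (0, 1j)),
    "X": ((0, 1), (1, 0)),
}
ACTIONS = list(GATES)
MAX_DEPTH = 3  # assumed cap on circuit length
TOL = 1e-9     # a cost below this counts as an exact compilation


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def unitary(seq):
    """Unitary V of a circuit whose gates are applied in sequence order."""
    v = ((1, 0), (0, 1))
    for name in seq:
        v = matmul(GATES[name], v)
    return v


def cost(seq, target):
    """1 - |Tr(V^dagger U)|^2 / 4: zero when V equals U up to a global phase."""
    v = unitary(seq)
    tr = sum(complex(v[i][j]).conjugate() * target[i][j]
             for i in range(2) for j in range(2))
    return 1 - abs(tr) ** 2 / 4


def greedy(qa, qb, state):
    """Action maximising the sum of both value tables."""
    return max(ACTIONS,
               key=lambda a: qa.get((state, a), 0.0) + qb.get((state, a), 0.0))


def train(target, episodes=500, alpha=0.2, gamma=0.9, batch=16, seed=0):
    rng = random.Random(seed)
    qa, qb, replay = {}, {}, []
    for ep in range(episodes):
        # Start fully random, then anneal epsilon linearly down to 0.1.
        eps = max(0.1, 1.0 - ep / (0.5 * episodes))
        state, done = (), False
        while not done:
            explore = rng.random() < eps
            action = rng.choice(ACTIONS) if explore else greedy(qa, qb, state)
            nxt = state + (action,)
            solved = cost(nxt, target) < TOL
            done = solved or len(nxt) == MAX_DEPTH
            replay.append((state, action, 1.0 if solved else 0.0, nxt, done))
            # Experience replay: learn from a random minibatch of past transitions.
            for s, a, r, s2, d in rng.sample(replay, min(batch, len(replay))):
                # Double Q-learning: one table picks the action, the other scores it.
                upd, other = (qa, qb) if rng.random() < 0.5 else (qb, qa)
                boot = 0.0
                if not d:
                    best = max(ACTIONS, key=lambda b: upd.get((s2, b), 0.0))
                    boot = other.get((s2, best), 0.0)
                old = upd.get((s, a), 0.0)
                upd[(s, a)] = old + alpha * (r + gamma * boot - old)
            state = nxt
    return qa, qb


def compile_circuit(target, qa, qb):
    """Greedy rollout of the learned policy, stopping on a match or at MAX_DEPTH."""
    state = ()
    while len(state) < MAX_DEPTH and cost(state, target) >= TOL:
        state += (greedy(qa, qb, state),)
    return state


target = unitary(("H", "S"))  # toy target U: H followed by S
qa, qb = train(target)
circuit = compile_circuit(target, qa, qb)
print(circuit, cost(circuit, target))
```

Because the discount factor is below 1, shorter matching circuits earn a higher return, which loosely mirrors the abstract's goal of compiling with fewer gates. In this toy setting, the only circuit of at most three gates that reproduces the target is H followed by S, so the greedy rollout should recover it. A realistic version would need parameterized gates, multiple qubits, and a function approximator in place of the lookup tables.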
Journal
Journal title: New Journal of Physics
Year: 2021
ISSN: 1367-2630
DOI: https://doi.org/10.1088/1367-2630/abe0ae